feat: Added calibration curve with multi-class classification support #1764
In terms of API, I think we will want the calibration display in an accessor other than metric. It is more of a diagnostic tool, so we need to think about where to add it. I'm also thinking that we should potentially enlarge the scope of the tool and call it a reliability diagram, which would cover both the classification and regression cases. Also, I have not yet gone into the code, but we need to think twice before implementing the multiclass case. Using a one-vs-rest approach might not be ideal, since some classifiers do not use this strategy as their prediction function. I was looking at https://arxiv.org/pdf/2112.10327 and https://arxiv.org/abs/2210.16315. It seems that there are several trade-offs, and we should look at them thoroughly.
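For reference, the one-vs-rest approach mentioned above can be sketched roughly as follows. This is a hand-rolled, standard-library-only illustration and not the code from this PR or from scikit-learn: the function names reliability_curve and one_vs_rest_curves, the uniform binning, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def reliability_curve(y_true, y_prob, n_bins=5):
    """Bin probabilities uniformly over [0, 1]; for each non-empty bin return
    (mean predicted probability, observed fraction of positives)."""
    sums = defaultdict(lambda: [0.0, 0, 0])  # bin -> [sum of probs, positives, count]
    for label, prob in zip(y_true, y_prob):
        # Clamp so that prob == 1.0 falls into the last bin.
        idx = min(int(prob * n_bins), n_bins - 1)
        sums[idx][0] += prob
        sums[idx][1] += label
        sums[idx][2] += 1
    return [
        (total / count, positives / count)
        for _, (total, positives, count) in sorted(sums.items())
    ]


def one_vs_rest_curves(y_true, proba, classes, n_bins=5):
    """One reliability curve per class: class k against all other classes."""
    curves = {}
    for k, cls in enumerate(classes):
        is_cls = [1 if label == cls else 0 for label in y_true]  # binarize labels
        prob_cls = [row[k] for row in proba]  # column k of the probability matrix
        curves[cls] = reliability_curve(is_cls, prob_cls, n_bins=n_bins)
    return curves


# Toy data (assumed): 4 samples, 3 classes, each row of proba sums to 1.
y_true = ["a", "b", "c", "a"]
proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.4, 0.3],
]
curves = one_vs_rest_curves(y_true, proba, classes=["a", "b", "c"], n_bins=2)
```

Each curve here only looks at one column of the probability matrix at a time, so it checks the per-class marginal probabilities rather than the full predicted vector. That is one of the trade-offs to weigh before settling on a multiclass design.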
Sure, to be fair I was just building upon the implementation proposed in https://github.com/probabl-ai/skore/pull/1315/files, but these architectural decisions are indeed crucial for implementing the feature.
Yep, no worries. It was just to write down some first thoughts on my side so that we don't forget them. We could always restrict the scope to binary classification and regression, because those cases are better defined, and then look at the multiclass case afterwards.
[automated comment] Please update your PR with main, so that the |
Closes #1316. Further iteration is waiting on the merge conflicts that will arise from #1669, #1701, #1709, etc.
Added a CalibrationCurveDisplay class that visualizes how well a classifier's predicted probabilities are calibrated. The implementation follows scikit-learn's API patterns (ref) and extends them to support both binary and multiclass classification.
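As a rough illustration of the scikit-learn display pattern referred to above (a from_predictions class method that computes the curve data, kept separate from the drawing step), here is a minimal standard-library sketch for the binary case. The class name CalibrationCurveSketch, its points method, and the uniform binning are assumptions for the example; this is not the CalibrationCurveDisplay added in this PR, and it does no plotting.

```python
from dataclasses import dataclass


@dataclass
class CalibrationCurveSketch:
    """Hypothetical stand-in for a display object; not the class from this PR."""

    prob_true: list  # observed fraction of positives per non-empty bin
    prob_pred: list  # mean predicted probability per non-empty bin

    @classmethod
    def from_predictions(cls, y_true, y_prob, n_bins=5):
        # Accumulate [sum of probabilities, positives, count] per uniform bin.
        bins = [[0.0, 0, 0] for _ in range(n_bins)]
        for label, prob in zip(y_true, y_prob):
            idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
            bins[idx][0] += prob
            bins[idx][1] += label
            bins[idx][2] += 1
        filled = [b for b in bins if b[2] > 0]  # drop empty bins
        return cls(
            prob_true=[b[1] / b[2] for b in filled],
            prob_pred=[b[0] / b[2] for b in filled],
        )

    def points(self):
        """(x, y) pairs that a plotting backend would draw against the diagonal."""
        return list(zip(self.prob_pred, self.prob_true))


# Toy binary data (assumed): labels are 0/1, y_prob is the positive-class probability.
display = CalibrationCurveSketch.from_predictions(
    y_true=[0, 0, 1, 1], y_prob=[0.1, 0.4, 0.6, 0.9], n_bins=2
)
```

Keeping the computed curve on the object means the same data can be inspected or tested without rendering a figure.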
Summary:
Testing
Added tests to cover binary classification scenarios.
Added tests to cover multi-class classification scenarios.
TODO:
Example file calibration plots: